Automatic ambiguity detection

Authors

  • Richard Sproat
  • Jan P. H. van Santen
Abstract

Most work on sense disambiguation presumes that one knows beforehand — e.g. from a thesaurus — a set of polysemous terms. But published lists invariably give only partial coverage. For example, the English word tan has several obvious senses, but one may overlook the abbreviation for tangent. In this paper, we present an algorithm for identifying interesting polysemous terms and measuring their degree of polysemy, given an unlabeled corpus. The algorithm involves: (i) collecting all terms within a k-term window of the target term; (ii) computing the inter-term distances of the contextual terms, and reducing the multi-dimensional distance space to two dimensions using standard methods; (iii) converting the two-dimensional representation into radial coordinates and using isotonic/antitonic regression to compute the degree to which the distribution deviates from a single-peak model. The amount of deviation is the proposed polysemy index.
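As a rough illustration of step (iii) only, the sketch below scores how far a set of angles departs from a single-peak shape. It is not the authors' published procedure: the function names (`pava`, `unimodal_sse`, `polysemy_index`), the 12-bin histogram, the squared-error loss, and the normalization by the total squared count are all assumptions made for this example. It also assumes the angles from step (ii) are already available. The isotonic fit uses the standard pool-adjacent-violators algorithm, and the antitonic fit is obtained by running the same routine on the reversed sequence.

```python
import math

def pava(values):
    """Least-squares non-decreasing fit (pool-adjacent-violators)."""
    blocks = []  # each block is [mean, number of pooled points]
    for v in values:
        blocks.append([float(v), 1])
        # Merge the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    fit = []
    for mean, weight in blocks:
        fit.extend([mean] * weight)
    return fit

def sse(values, fit):
    """Sum of squared residuals between data and fit."""
    return sum((v - f) ** 2 for v, f in zip(values, fit))

def unimodal_sse(values):
    """Smallest residual of a rise-then-fall fit over all peak positions."""
    best = math.inf
    for peak in range(len(values) + 1):
        left, right = values[:peak], values[peak:]
        rising = pava(left)
        # Antitonic fit: isotonic fit of the reversed data, reversed back.
        falling = pava(right[::-1])[::-1]
        best = min(best, sse(left, rising) + sse(right, falling))
    return best

def polysemy_index(angles, bins=12):
    """Hypothetical index: normalized deviation from a single angular peak."""
    counts = [0] * bins
    for a in angles:
        frac = (a % (2 * math.pi)) / (2 * math.pi)
        counts[min(bins - 1, int(frac * bins))] += 1
    total = sum(c * c for c in counts)
    if total == 0:
        return 0.0
    # Angles are circular, so try every cut point before fitting.
    best = min(unimodal_sse(counts[r:] + counts[:r]) for r in range(bins))
    return best / total
```

Under these assumptions, angles concentrated in one direction give an index of zero, while two well-separated clusters give a positive value; for instance, `polysemy_index([0.5] * 10 + [3.6] * 10)` is greater than `polysemy_index([1.0] * 10 + [1.1] * 5)`.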

Related resources

Using Ambiguity Detection to Streamline Linguistic Annotation

Arabic writing is typically underspecified for short vowels and other markups, referred to as diacritics. In addition to the lexical ambiguity exhibited in most languages, the lack of diacritics in written Arabic adds another layer of ambiguity which is an artifact of the orthography. In this paper, we present the details of three annotation experimental conditions designed to study the impact ...

Automatic Term Ambiguity Detection

While the resolution of term ambiguity is important for information extraction (IE) systems, the cost of resolving each instance of an entity can be prohibitively expensive on large datasets. To combat this, this work looks at ambiguity detection at the term, rather than the instance, level. By making a judgment about the general ambiguity of a term, a system is able to handle ambiguous and una...

Ambiguity Detection: Scaling to Scannerless

Static ambiguity detection would be an important aspect of language workbenches for textual software languages. However, the challenge is that automatic ambiguity detection in context-free grammars is undecidable in general. Sophisticated approximations and optimizations do exist, but these do not scale to grammars for so-called “scannerless parsers”, as of yet. We extend previous work on ambig...

Ambiguity Detection: Scaling towards Scannerless

Static ambiguity detection would be an important aspect of language workbenches for textual software languages. The challenge is that automatic ambiguity detection of context-free grammars is undecidable. Sophisticated approximations and optimizations do exist, but these do not scale to grammars for so-called “scannerless parsers”, as of yet. We extend previous work on ambiguity detection for c...

Reducing Light Change Effects in Automatic Road Detection

Automatic road extraction from aerial images can be very helpful in traffic control and vehicle guidance systems. Most of the road detection approaches are based on image segmentation algorithms. Color-based segmentation is very sensitive to light changes and consequently the change of weather condition affects the recognition rate of road detection systems. In order to reduce the light change ...

Automatic Detection and Resolution of Lexical Ambiguity in Process Models (Extended Abstract)

Process models play an important role in various system-related management activities including requirements elicitation, domain analysis, software design as well as documentation of databases, business processes, and software systems. However, it has been found that the correct and meaningful usage of process models appears to be a challenge in practical settings requiring the usage of automat...

Publication date: 1998